An analysis of various policy instruments to reduce congestion, fuel consumption and CO2 emissions in Beijing
Using a nested multinomial logit model of car ownership and personal travel in Beijing circa 2005, this paper compares the effectiveness of different policy instruments to reduce traffic congestion and CO2 emissions. The study shows that a congestion toll is more efficient than a fuel tax in reducing traffic congestion, whereas a fuel tax is more effective as a policy instrument for reducing gasoline consumption and emissions. An improvement in car efficiency would also reduce congestion, fuel consumption, and CO2 emissions significantly; however, this policy benefits only richer households that own a car. Low-income households do better under the fuel tax policy than under the efficiency improvement and congestion toll policies. The congestion toll and fuel tax require the travel cost per mile to more than triple. The responsiveness of aggregate fuel consumption and CO2 emissions is approximately a 1 percent drop for each 10 percent rise in the money cost of a car trip.
Transport Economics Policy & Planning, Airports and Air Services, Roads & Highways, Transport and Environment, Transport in Urban Areas, Urban Transport
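The responsiveness figure in the abstract above (roughly a 1 percent drop for each 10 percent rise in trip cost) can be read as an elasticity of about -0.1. The sketch below only restates that arithmetic. The function names are hypothetical, and the linear projection is an assumption that holds for small changes only; it is not the paper's nested logit model.

```python
def implied_elasticity(pct_change_quantity: float, pct_change_cost: float) -> float:
    """Ratio of percentage changes: a simple point estimate of elasticity."""
    return pct_change_quantity / pct_change_cost


def projected_change(elasticity: float, pct_change_cost: float) -> float:
    """Linear approximation of the percentage change in fuel use or CO2.

    Assumption: the elasticity is treated as constant, which is only
    reasonable for modest cost changes.
    """
    return elasticity * pct_change_cost


# Figures taken from the abstract: a 1 percent drop per 10 percent cost rise.
elasticity = implied_elasticity(-1.0, 10.0)

# Hypothetical scenario (not from the paper): a 25 percent rise in trip cost.
change = projected_change(elasticity, 25.0)
print(f"elasticity = {elasticity:.2f}, projected change = {change:.1f}%")
```

Because the abstract also notes that the toll and tax would more than triple the per-mile travel cost, a constant-elasticity extrapolation to changes of that size should be treated with caution.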
Does Government Investment in Local Public Goods Spur Gentrification? Evidence from Beijing
In Beijing, the metropolitan government has made enormous place-based investments to increase green space and to improve public transit. We examine the gentrification consequences of such public investments. Using unique geocoded real estate and restaurant data, we document that the construction of the Olympic Village and two recent major subway systems has led to increased new housing supply in the vicinity of these areas, higher local prices, and an increased quantity of nearby private chain restaurants.
Self-Distillation Network with Ensemble Prototypes: Learning Robust Speaker Representations without Supervision
Training speaker-discriminative and robust speaker verification systems
without speaker labels is still challenging and worthwhile to explore. Previous
studies have noted a substantial performance disparity between self-supervised
and fully supervised approaches. In this paper, we propose an effective
Self-Distillation network with Ensemble Prototypes (SDEP) to facilitate
self-supervised speaker representation learning. A range of experiments
conducted on the VoxCeleb datasets demonstrate the superiority of the SDEP
framework in speaker verification. SDEP achieves a new SOTA on the VoxCeleb1
speaker verification evaluation benchmark (i.e., equal error rates of 1.94%,
1.99%, and 3.77% for the Vox1-O, Vox1-E, and Vox1-H trials, respectively)
without using any speaker labels in the training phase. Code will be publicly
available at https://github.com/alibaba-damo-academy/3D-Speaker.
Comment: arXiv admin note: text overlap with arXiv:2211.0416
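The results above are reported as equal error rates (EER), the operating point where the false acceptance rate equals the false rejection rate. As a rough illustration of how such a number is obtained from trial scores, here is a minimal sketch. The function name, the threshold sweep over observed scores, and the toy scores are assumptions for illustration, not the evaluation code used by the authors.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate EER for similarity scores (higher = more likely same speaker).

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    targets = scores[labels == 1]
    nontargets = scores[labels == 0]

    # Sweep every observed score as a candidate decision threshold.
    thresholds = np.unique(scores)
    # False rejection: target trials scoring below the threshold.
    frr = np.array([(targets < t).mean() for t in thresholds])
    # False acceptance: non-target trials scoring at or above the threshold.
    far = np.array([(nontargets >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and average them.
    idx = np.argmin(np.abs(far - frr))
    return float((far[idx] + frr[idx]) / 2.0)


# Toy trial list (hypothetical scores, not VoxCeleb data).
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
print(equal_error_rate(toy_scores, toy_labels))
```

On real trial lists with many scores, evaluation toolkits typically interpolate along the error curve rather than choosing the nearest threshold, so values from this sketch may differ slightly.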
The birth of edge cities in China: measuring the spillover effects of industrial parks
Since its establishment as a high-tech science park in 1988, Zhongguancun has been transformed from a village into China’s “Silicon Valley”. Zhongguancun’s success has led many Chinese local governments to embrace “place-based” investments and support the building of industrial parks (special economic zones, SEZs). In fact, this is a growing global trend. A recent Economist article reported that there are more than 4,000 SEZs (industrial parks) around the world, ranging from basic export processing zones and science parks to more high-tech economic zones.
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec
This paper presents FunCodec, a fundamental neural speech codec toolkit,
which is an extension of the open-source speech processing toolkit FunASR.
FunCodec provides reproducible training recipes and inference scripts for the
latest neural speech codec models, such as SoundStream and Encodec. Thanks to
the unified design with FunASR, FunCodec can be easily integrated into
downstream tasks, such as speech recognition. Along with FunCodec, pre-trained
models are also provided, which can be used for academic or generalized
purposes. Based on the toolkit, we further propose the frequency-domain codec
models, FreqCodec, which can achieve comparable speech quality with much lower
computation and parameter complexity. Experimental results show that, under the
same compression ratio, FunCodec can achieve better reconstruction quality
compared with other toolkits and released models. We also demonstrate that the
pre-trained models are suitable for downstream tasks, including automatic
speech recognition and personalized text-to-speech synthesis. This toolkit is
publicly available at https://github.com/alibaba-damo-academy/FunCodec.
Comment: 5 pages, 3 figures, submitted to ICASSP 202
Towards a System of Open Cities in China: Home Prices, FDI Flows and Air Quality in 35 Major Cities
Over the last thirty years, China's major cities have experienced significant income and population growth. Much of this growth has been fueled by urban production spurred by world demand. Using a unique cross-city panel data set, we test several hypotheses concerning the relationship between home prices, wages, foreign direct investment and ambient air pollution across major Chinese cities. Home prices are lower in cities with higher ambient pollution levels. Cities featuring higher per-capita FDI flows have lower pollution levels.
Pushing the limits of self-supervised speaker verification using regularized distillation framework
Training robust speaker verification systems without speaker labels has long
been a challenging task. Previous studies observed a large performance gap
between self-supervised and fully supervised methods. In this paper, we apply a
non-contrastive self-supervised learning framework called DIstillation with NO
labels (DINO) and propose two regularization terms applied to embeddings in
DINO. One regularization term guarantees the diversity of the embeddings, while
the other regularization term decorrelates the variables of each embedding. The
effectiveness of various data augmentation techniques is explored in both the
time and frequency domains. A range of experiments conducted on the VoxCeleb
datasets demonstrate the superiority of the regularized DINO framework in
speaker verification. Our method achieves the state-of-the-art speaker
verification performance under a single-stage self-supervised setting on
VoxCeleb. The code will be made publicly available.
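The abstract above describes two regularization terms on the embeddings, one encouraging diversity and one decorrelating the embedding variables, but it does not give their formulas. The sketch below shows one common way such terms are written: a hinge on the per-dimension standard deviation and a penalty on off-diagonal covariance entries, in the style of variance/covariance regularizers. The function names, `target_std`, and `eps` are assumptions for illustration and are not necessarily the authors' formulation.

```python
import numpy as np


def diversity_penalty(emb, target_std=1.0, eps=1e-4):
    """Hinge on the per-dimension standard deviation across the batch.

    The penalty is zero when every dimension has std >= target_std and
    grows as embeddings collapse towards a single point.
    """
    std = np.sqrt(emb.var(axis=0) + eps)
    return float(np.mean(np.maximum(0.0, target_std - std)))


def decorrelation_penalty(emb):
    """Sum of squared off-diagonal covariance entries, scaled by dimension."""
    n, d = emb.shape
    centered = emb - emb.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum() / d)


# Collapsed batch: every embedding identical, so the diversity penalty is large.
collapsed = np.ones((8, 4))
# Batch whose two dimensions are uncorrelated, so the decorrelation penalty is zero.
uncorrelated = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])

print(diversity_penalty(collapsed), decorrelation_penalty(uncorrelated))
```

In a training loop, terms like these would be added to the distillation loss with weighting coefficients; the weights are hyperparameters that this sketch does not specify.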
3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement
Disentangling uncorrelated information in speech utterances is a crucial
research topic within the speech community. Different speech-related tasks focus on
extracting distinct speech representations while minimizing the effects of
other uncorrelated information. We present a large-scale speech corpus to
facilitate the research of speech representation disentanglement. 3D-Speaker
contains over 10,000 speakers, each of whom is simultaneously recorded by
multiple Devices, located at different Distances, and some speakers are
speaking multiple Dialects. The controlled combinations of multi-dimensional
audio data yield a matrix of a diverse blend of speech representation
entanglement, thereby motivating intriguing methods to untangle them. The
multi-domain nature of 3D-Speaker also makes it a suitable resource to evaluate
large universal speech models and experiment with methods of out-of-domain learning
and self-supervised learning. https://3dspeaker.github.io